I am graphing several columns of a large array of data (loaded with numpy.genfromtxt) against an equally sized time column. Missing data is often marked as nan, -999, -9999, etc. However, I can't figure out how to remove multiple such values from the array. This is what I currently have:
for cur_col in range(start_col, total_col):
    # Generate what is to be graphed by removing nan values
    data_mask = (file_data[:, cur_col] != nan_values)
    y_data = file_data[:, cur_col][data_mask]
    x_data = file_data[:, time_col][data_mask]
After that, I use matplotlib to create the appropriate figure for each column. This works fine if nan_values is a single integer, but I would like to use a list of values.
EDIT: Here is a working example.
import numpy as np

file_data = np.arange(12.0).reshape((4, 3))
file_data[1, 1] = np.nan
file_data[2, 2] = -999
nan_values = -999

for cur_col in range(1, 3):
    # Generate what is to be graphed by removing nan values
    data_mask = (file_data[:, cur_col] != nan_values)
    y_data = file_data[:, cur_col][data_mask]
    x_data = file_data[:, 0][data_mask]
    print('y: ' + str(y_data))
    print('x: ' + str(x_data))

print(file_data)
>>> y: [ 1. nan 7. 10.]
x: [ 0. 3. 6. 9.]
y: [ 2. 5. 11.]
x: [ 0. 3. 9.]
[[   0.    1.    2.]
 [   3.   nan    5.]
 [   6.    7. -999.]
 [   9.   10.   11.]]
This does not work if nan_values = ['nan', -999], which is what I am trying to accomplish.
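For context, one reason an equality test cannot catch NaN, whether nan_values is a single value or a list: NaN never compares equal to anything, including itself. A minimal sketch with a made-up column (the name col is only for illustration, it is not from the code above):

```python
import numpy as np

# Hypothetical column containing one NaN and one -999 sentinel
col = np.array([1.0, np.nan, 7.0, -999.0])

# Equality against NaN is False everywhere, even at the NaN entry
eq_mask = (col == np.nan)

# np.isnan is the reliable way to locate NaN entries
nan_mask = np.isnan(col)

print(eq_mask)
print(nan_mask)
```

This is why the nan in column 1 survives the `!=` filter in the output above, while the -999 in column 2 is removed.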
user545424 :
I would suggest using masked arrays like so:

>>> a = np.arange(12.0).reshape((4,3))
>>> a[1,1] = np.nan
>>> a[2,2] = -999
>>> a
array([[   0.,    1.,    2.],
       [   3.,   nan,    5.],
       [   6.,    7., -999.],
       [   9.,   10.,   11.]])
>>> m = np.ma.array(a,mask=(~np.isfinite(a) | (a == -999)))
>>> m
masked_array(data =
 [[0.0 1.0 2.0]
 [3.0 -- 5.0]
 [6.0 7.0 --]
 [9.0 10.0 11.0]],
 mask =
 [[False False False]
 [False True False]
 [False False True]
 [False False False]],
 fill_value = 1e+20)
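The same mask idea can probably be extended to a list of sentinel values and applied per column, as in the question's loop. The following is only a sketch under some assumptions: missing_mask and sentinels are hypothetical names introduced here, and np.isin requires NumPy 1.13 or newer.

```python
import numpy as np

def missing_mask(values, sentinels):
    """Return True where values is NaN/inf or equals one of the sentinels.

    Hypothetical helper, not part of NumPy.
    """
    values = np.asarray(values, dtype=float)
    return ~np.isfinite(values) | np.isin(values, sentinels)

# Same sample data as in the question
file_data = np.arange(12.0).reshape((4, 3))
file_data[1, 1] = np.nan
file_data[2, 2] = -999

sentinels = [-999, -9999]  # assumed list of missing-data codes
time_col = 0

results = {}
for cur_col in range(1, 3):
    # Keep only the rows where this column holds real data
    keep = ~missing_mask(file_data[:, cur_col], sentinels)
    y_data = file_data[keep, cur_col]
    x_data = file_data[keep, time_col]
    results[cur_col] = (x_data, y_data)
    print('y: ' + str(y_data))
    print('x: ' + str(x_data))
```

NaN is handled by np.isfinite rather than by listing 'nan' among the sentinels, so the list only needs the numeric codes. If a masked array is preferred over filtered copies, the same boolean array can be passed as mask= to np.ma.array, as in the session above.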
2012-06-21T21:13:30